Skip to content

Conversation

@tashiro-akira
Copy link
Contributor

@AkiraUra @kimusaku
Fix target column names to revert after removing special characters.
This fix is necessary to address the preprocess issue sapientml/preprocess#8.
Please check.

@tashiro-akira tashiro-akira requested a review from a team as a code owner December 12, 2023 06:48
@tashiro-akira tashiro-akira requested review from fukuta-flab and kimusaku and removed request for a team December 12, 2023 06:48
tashiro akira added 2 commits December 12, 2023 15:50
Copy link
Contributor

@AkiraUra AkiraUra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you consider my comments?

Comment on lines 222 to 248
pipeline.validation = code_block.validation + pipeline.validation
pipeline.test = code_block.test + pipeline.test
if "cols_has_symbols" in pipeline.test:
addindex = pipeline.test.index("perm_df = pd.DataFrame")
pipeline.test = (
pipeline.test[:addindex]
+ "feature_train_csv = feature_train.rename(columns=rename_symbol_cols)\n "
+ pipeline.test[addindex:]
)
addindex = pipeline.test.index("prediction = pd.DataFrame")
pipeline.test = (
pipeline.test[:addindex]
+ "TARGET_COLUMNS_csv = [rename_symbol_cols[TARGET_COLUMNS[0]]]\n"
+ pipeline.test[addindex:]
)
else:
addindex = pipeline.test.index("perm_df = pd.DataFrame")
pipeline.test = (
pipeline.test[:addindex] + "feature_train_csv = feature_train\n " + pipeline.test[addindex:]
)
addindex = pipeline.test.index("prediction = pd.DataFrame")
pipeline.test = (
pipeline.test[:addindex] + "TARGET_COLUMNS_csv = [TARGET_COLUMNS[0]]\n" + pipeline.test[addindex:]
)

pipeline.train = code_block.train + pipeline.train
pipeline.predict = code_block.predict + pipeline.predict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the following code?

            if "cols_has_symbols" in pipeline.test:
                pipeline.test = pipeline.test.replace('perm_df.to_csv', 'perm_df.rename(columns=rename_symbol_cols).to_csv')
                def replace_targets(match_obj):
                    return match_obj[0].replace('TARGET_COLUMNS', "[rename_symbol_cols.get(v, v) for v in TARGET_COLUMNS]")
                pat = r"prediction = pd.DataFrame\(y_prob, columns=.?TARGET_COLUMNS.*, index=feature_test.index\)"
                pipeline.test = pat.sub(replace_targets, pipeline.test)
                pipeline.predict = pat.sub(replace_targets, pipeline.predict)
            pipeline.validation = code_block.validation + pipeline.validation
            pipeline.test = code_block.test + pipeline.test
            pipeline.train = code_block.train + pipeline.train
            pipeline.predict = code_block.predict + pipeline.predict

Copy link
Contributor Author

@tashiro-akira tashiro-akira Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AkiraUra
When I modified the code as described in the comments, the output csv column names were not returned from the modified state. (I also modified preprocess.)

The string substitution method of the indicated code is shorter than the previous code, so I will try to fix it referring to this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these changes are necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these changes are necessary.

@tashiro-akira
Copy link
Contributor Author

Reflected the content of the review.
Undid unnecessary modifications.

@kimusaku kimusaku enabled auto-merge (squash) February 6, 2024 07:38
Signed-off-by: tashiro akira <[email protected]>
auto-merge was automatically disabled March 22, 2024 07:16

Head branch was pushed to by a user without write access

tashiro akira added 2 commits March 22, 2024 16:24
@tashiro-akira tashiro-akira requested a review from AkiraUra April 4, 2024 02:59
Copy link
Contributor

@AkiraUra AkiraUra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. The code of rename_dict is duplicated from the preprocess one.
Could you consider a way of removing the duplication?

@tashiro-akira
Copy link
Contributor Author

Changed branches and issued new pull requests to capture core and preprocess integration fixes.

The correspondence of the comment is also described in the new pull request.
PR:#78

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants